(57)Abstract:

PROBLEM TO BE SOLVED: To provide a content indexing search system that filters and blocks
content by providing a means for modifying the content indexing search engine so that it
implements the same content blocking policy as a blocking engine.
SOLUTION: A gateway 124 is used as an interface between a plurality of clients or an internal
network 107 and the Internet 106. A cache and blocking proxy server 126 is usually inserted
into the connection path from the internal network 107 to the Internet 106 to implement the
content blocking policy, thereby improving performance and management. The cache and blocking
proxy server 126 can be connected to the gateway 124, or it can be connected directly in
parallel to the internal network 107 and an external data transmission network 106.
* NOTICES *

JPO and INPIT are not responsible for any
damages caused by the use of this translation.

1. This document has been translated by computer, so the translation may not reflect the original precisely.
2. **** shows a word which cannot be translated.
3. In the drawings, any words are not translated.



CLAIMS 
[Claim(s)] 

[Claim 1]A content indexing search system which provides search results consistent with content
filtering and blocking restrictions, comprising:
a content indexing search engine including a database;
a cache and blocking proxy server including a cache;
an information network coupled to said content indexing search engine;
a means for submitting a search query to said content indexing search engine and receiving search
results from said cache;
a blocking engine which is coupled to said content indexing search engine and implements a content
filtering and blocking policy; and
a means for modifying said content indexing search engine to implement the same content blocking
policy as said blocking engine.

[Claim 2]The system according to claim 1, further comprising a means for implementing said blocking
policy in said content indexing search engine during a content indexing phase.

[Claim 3]The system according to claim 1, further comprising a means for implementing said blocking
policy during an end-user search-results display phase.

[Claim 4]The system according to claim 1, further comprising a means for modifying said content
indexing search engine to build an indexing database by indexing the contents of the cache.

[Claim 5]The system according to claim 1, further comprising a means for modifying said content
indexing search engine to incorporate the results of said cache and blocking engine when said
content indexing search engine builds an indexing database.

[Claim 6]In a content indexing search system having a content indexing search engine coupled to a
database and a cache, an information network coupled to said content indexing search engine, and a
blocking engine which implements content filtering and blocking restrictions on the search results
provided to an end user via said cache, a method of providing search results consistent with the
content filtering and blocking implementation, comprising:
(a) a step of modifying the process of said content indexing search engine to skip any information
site URL that matches an exclusion pattern;
(b) a step of modifying the process of said content indexing search engine to search only sites or
root content sources that match a list of explicitly permitted information site URLs; and
(c) a step of implementing, in said content indexing search engine, a filtering policy defined in
said cache and blocking engine,
wherein said filtering policy of step (c) is defined by:
(i) a step of reading the content filtering rules from said cache and filtering engine at regular
intervals or whenever a change is detected;
(ii) a step of generating multiple indexing database trees, each tree being associated with a user
group served under said content filtering rules;
(iii) a step of avoiding display to a user of any information site, URL, or document that matches
an exclusion pattern;
(iv) a step of displaying document/content pointers whose source matches the list of explicitly
permitted information site URLs; and
(v) a step of displaying to the user only information network / content / documents in accordance
with a filtering process defined in said cache and said blocking engine,
wherein said filtering process of step (v) is defined by:
(aa) a step of reading said content filtering rules from said cache and filtering engine at regular
intervals or whenever a change is detected; and
(bb) a step of displaying to the user only the search results permitted under said filtering rules
for each individual or group.

[Claim 7]The method according to claim 6, further comprising: (d) a step of changing the scanning
target of the content engine to the content cache storage rather than an information site/URL list;
and (e) a step of traversing the URL / content / document tree of said cache and blocking engine by
API, database scan, and shared file scan.

[Claim 8]The method according to claim 6, further comprising: (f) a step of modifying the content
scanning and indexing process of said search engine so that it is configured like an end user's
browser.

[Claim 9]The method according to claim 6, further comprising: (g) a step of modifying said content
indexing search engine so that it passes through said cache when building an indexing database.

[Claim 10]The method according to claim 6, further comprising: (h) a step of modifying said content
indexing search engine to build an indexing database by searching said cache.

[Claim 11]The method according to claim 6, further comprising: (i) a step of connecting said content
indexing search engine to an internal network; and (j) a step of connecting said content indexing
search engine to internal network operation.

[Claim 12]The method according to claim 6, further comprising a step of connecting said content
indexing search engine to an external network and providing consistency with an organization's
content blocking policy.



JP,2000-357176,A [DETAILED DESCRIPTION] 



DETAILED DESCRIPTION 

[Detailed Description of the Invention] 
[0001] 

[Field of the Invention]This invention relates to an information retrieval system. More
specifically, this invention relates to a content indexing search system and method that provide
search results consistent with the content filtering and blocking policy implemented by a blocking
engine.

[0002] [Description of the Prior Art]With the explosive growth of text and multimedia content
available on the Internet and other data networks and systems, end users increasingly depend on text
and keyword search tools to find relevant information. An end user enters into a search tool or
search engine keywords describing the information or documents sought. The search tool or search
engine consults an existing indexing database and returns a list of pointers to documents considered
relevant, together with each document's title and a description of several lines, which in many
cases consists of text extracted from the document body. The end user then navigates some or all of
the pointers returned by the search and views the actual documents or content online. Generally, a
search engine's indexing database is built automatically or semi-automatically: an automated program
is started at a content source (for example, an Internet website), automatically follows the links
from the root content source through the content tree (which in many cases lead to other sites), and
indexes the information found into the database for future searches. For a large content source
such as the websites on the Internet, automated search and indexing is the only practical way to
build an index search database.

[0003]As the diversity of information available on online systems and networks increases, companies,
individuals, groups, and network service providers (NSPs) increasingly implement policies and
controls that restrict access to such content, in order to screen out content considered unsuitable
or unnecessary for end users. Such content management policies generally block unwanted content from
reaching all or some of the end users of a given online service or network. Content blocking is
generally performed by a content proxy gateway, a data network firewall, or another device inserted
between the end user and the target content source. Content filtering is often implemented as part
of a content cache engine, in which only the content requested by the user population is held in the
cache and unwanted content is prevented from being cached. All users can access network content
only by passing through the cache. Content is blocked when it is generally regarded as harmful or
unsuitable for a user group or for business use, or otherwise at specific times of day. NSPs and
companies often rely on a rating system or service, for example the Platform for Internet Content
Selection (PICS), to judge the suitability of a content site or document for a specific population.
In some systems, end users may choose a set of blocking policies to impose on themselves.

[0004]An important problem arises for NSPs and data transfer providers, who are caught between the
need for ********** automated search engines to cope with a large volume of content and the need for
blocking engines that automatically prevent some of that content from eventually reaching the end
user. The particular problem is the lack of integration and coordination between the engines that
implement the filtering and blocking policy and the search engines. This lack of integration arises
for several reasons: (a) many organizations implement content filtering and blocking policies at
their own sites while relying on search engines, for example search engines or services available on
the Internet; and (b) search engines are deliberately designed to search and index as much content
as possible and to actively seek out all content, whereas filtering and blocking engines are
deliberately designed to be selective about which documents are cached and eventually displayed to
the end user.

[0005]Integration and coordination of these two information retrieval functions are hindered by the
intrinsically different roles of search engines and blocking engines, and by the demands for
efficient implementation and high performance.
[0006]The problem becomes apparent when an end user uses a search engine service and is shown the
titles and descriptions of content documents that ultimately cannot be accessed because of the
filtering/blocking policy. Besides the inconvenience and frustration this inconsistency causes the
end user, the titles and short descriptions returned by the search engine may themselves be highly
objectionable or otherwise undesirable.

[0007]Therefore, there is a need for a blocking method and an information retrieval system that
yield search results consistent with the blocking policy, with as little protocol and performance
impact as practicable.
[0008]The following prior art relates to content indexing search and blocking systems.
[0009]U.S. Pat. No. 5,701,469 (Brandli et al.), issued December 23, 1997, discloses a content
indexing search system that runs a search-results collection routine which removes stored search
results that were included erroneously and adds stored search results that were excluded
erroneously. In this way, accurate search results are produced in response to a user's query even if
the content index used to generate the initial search results is not up to date.
[0010]U.S. Pat. No. 5,835,722 (Bradshaw et al.), filed June 27, 1996 and issued November 10, 1998,
discloses a terminal that blocks the use and transmission of unsuitable material by comprehensively
monitoring computer operations for the creation or transmission of such material. The terminal is
thereby blocked, and the block is released only by supervisory intervention.

[0011]U.S. Pat. No. 5,706,507 (Schloss), filed July 5, 1995 and issued January 6, 1998, discloses an
advisory server, operated by a third party, that rates the content of data downloaded from a content
server so that unwanted material can be blocked or flagged.
[0012]U.S. Pat. No. 5,619,648 (Canale et al.), issued April 8, 1997, discloses e-mail filtering that
decides whether an e-mail message should be delivered to a user based on a model of the
relationships associated with that user.

[0013]None of the prior art discloses a content indexing search system that provides search results
conforming to the blocking policy implemented by a blocking engine, such that only content permitted
by the blocking policy is returned to the end user as a result of a content search.
[0014] 

[Problem(s) to be Solved by the Invention]The first object of this invention is to provide an
improved information retrieval system and method of operation that provide consistency between
search engine results and a content blocking policy.

[0015]The second object of this invention is to provide an improved content indexing search system
and method of operation that provide search results consistent with a blocking policy.
[0016]The third object of this invention is to provide an improved content indexing search system
and method that implement a blocking policy in a cache and filtering engine.
[0017]The fourth object of this invention is to provide an improved content indexing search system
and method of operation that implement a blocking policy during the content indexing phase.
[0018]The fifth object of this invention is to provide an improved content indexing search system
and method of operation that implement a blocking policy during the phase in which search results
are displayed to the end user.

[0019]The sixth object of this invention is to provide an improved content indexing search system
and method of operation that search and index the local repository of the cache and blocking engine
instead of the final content sites and content servers.
[0020]The seventh object of this invention is to provide an improved content indexing search system
and method of operation configured to pass through the cache and filtering engine on the way to the
target content.
[0021] 

[Means for Solving the Problem]These and other objects, features, and advantages are achieved by an
information retrieval network that includes a content indexing search engine with a database and a
cache engine coupled between the search engine and the end users, in order to implement policy
management that generally blocks unwanted content. In alternative embodiments, the search results
are made consistent with the filtering and blocking policy of the end user's organization.
[0022]In the first embodiment, only content permitted by the blocking policy is added to the search
engine's indexing database. In the second embodiment, the search and display process of the search
engine is modified to implement the blocking policy. In the third embodiment, the operation of the
search engine and the target of its indexing automaton process are modified so that the indexing
database is built by searching the contents of the cache engine. In the fourth embodiment, the
scanning and indexing automaton of the search engine is configured like an end user's browser; that
is, it passes through the cache and filtering engine in order to reach the target content.
[0023] 

[Embodiment of the Invention]In FIG. 1, the information retrieval system 100 has two or more client
devices 102, 104 connected to the Internet or another distributed data network 106 via an internal
or managed network 107. A typical client is a personal computer (PC) with a display 110, a keyboard
111, a CPU 112, a memory 113, and a network connection I/O device 115. Examples of such clients and
networks include a business user's PC connected to a company's internal network and a home user's PC
connected to a service provider's network; in either case, the larger Internet is ultimately
accessed. A browser 116, such as those sold under the trademarks Netscape Communicator and IBM Web
Explorer, is installed in the memory 113 together with a standard operating system 117 and
application programs 118. The browser 116 runs on the client devices 102 and 104 to read or download
content from a content server 120 connected to the Internet 106. Each content server has a database
122 for storing the data with which it answers content requests from clients such as 102 and 104. In
one form, the data is stored as a collection of HTML documents containing text and other multimedia
content.

[0024]As shown in the figure, the gateway 124 is generally used as an interface between two or more
clients, or the internal network segment 107, and the Internet 106. Usually, a proxy server 126
having a cache and a content filtering engine is inserted in the connection path from the internal
network 107 to the Internet 106, which improves performance and management by implementing a content
blocking policy. The caching and blocking proxy server can be connected to the gateway, or it can be
connected directly in parallel to the internal network 107 and the external network 106.

[0025]The client systems 102 and 104 running the web browser 116 request content from the content
server 120 using HTTP (Hypertext Transfer Protocol) requests and receive the content in HTTP
responses. HTTP requests and responses are carried over a TCP/IP socket on the communication link
between the client and the content server. A user generates a content request either by explicitly
requesting content stored on a content server or by following a hyperlink anchor that points to
content stored on a content server. On receiving the request, the browser loads the content using an
HTTP session. HTTP is described in detail in "Hypertext Transfer Protocol - HTTP/1.0" by Berners-Lee
et al., draft-ietf-http-v10-spec-00.txt, March 8, 1995, which is incorporated herein by reference in
its entirety. HTML is described in detail in "Hypertext Markup Language (HTML)" by Berners-Lee,
draft-ietf-iiir-html-01, June 1993 (expired draft), which is incorporated herein by reference in its
entirety. TCP/IP sockets are described in detail in "TCP/IP Illustrated, Vol. 1 - The Protocols" by
W. Richard Stevens, Addison-Wesley, 1994, pages 1-20 and 229-262, which is incorporated herein by
reference in its entirety.
[0026]A user of a client system using the web browser 116 most often accesses conventional dedicated
search engine servers 130 and 135, with their respective databases 131 and 136, to locate Internet
content by keyword search. Such a dedicated engine server may be a search engine server 130 external
to the managed network 107 or an internal search engine server 135. While both perform the same
basic function, the internally attached search engine server 135 is managed independently by the
internal network operator and is therefore preferred. The process of this invention is generally
implemented in the internally attached and managed search engine 135, or in an external search
engine 130 that, as a service to an organization, provides consistency with that organization's
content blocking policy. As a result of a keyword search directed to the search engine server 130 or
135, the end user sees displayed in the web browser 116 a list of matching URLs with text extracts,
presented as hyperlink anchors to the final content. Using the web browser 116, the user can select
and follow the link to one or more of the matching content items.

[0027]In FIG. 2, a table 200 of sample content filtering/blocking configurations is created by a
client or by a network/service administrator and is installed in the proxy server 126 in order to
filter or limit the availability of content considered unsuitable or unnecessary. Generally, these
content access control methods block undesirable content from reaching all or some of the end users
of a given online service or network. The table is installed in the cache and filtering engine 126
and is usually stored in the database 127. In one form, the table has, for each user or each user
group, a row 201 containing one or more of the following: a user or group identifier (ID) 203, a
list 205 of keywords to block, PICS (Platform for Internet Content Selection) rules 207, a blacklist
209 of URLs that must not be accessed, and a whitelist 211 of the only URLs that may be accessed.
URLs are described in "Uniform Resource Locators (URL)" by Berners-Lee et al., RFC 1738, December
1994, which is incorporated herein by reference in its entirety. A PICS rating is obtained from PICS
rules, which permit or block access to a URL based on the PICS labels, embedded in documents, that
describe the URL. PICS rules are described at http://www.w3.org/TR/REC-PICSRules-971229, published
on the Internet by the W3C. Specifically, PICS Rules is a language for expressing filtering rules
(profiles) that permit or block access to URLs based on the PICS labels that describe those URLs.
Labels are created using software tools based on the PICS Technical Specification 1.1, available on
the Internet at http://www.w3.org/PICS/. A software tool is used to generate a label for the
document that describes a specific URL. Alternatively, an independent third-party rater distributes
labels through a separate server called a label bureau instead of embedding the labels in the
documents. Filtering software knows to consult the label bureau to find labels, much as consumers
know to read particular magazines that review appliances or automobiles. Once a label is generated,
it is inserted as an additional header in the HTTP header stream that precedes the content of the
document sent to the web browser. Alternatively, the label can be embedded in an HTML document using
a META tag. With this method, labels are sent only with HTML documents, not with images, video, or
other content. PICS-compliant content servers are available from International Business Machines
Corporation (Armonk, NY).
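One way to picture a row 201 of table 200 is as a small record with one field per column. The
following is a minimal sketch under stated assumptions, not the implementation described in this
document: the names PolicyRow, TABLE_200, and row_for, the example group IDs, and the example.com URL
patterns are all hypothetical, and the PICS rules are held as opaque strings rather than parsed.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class PolicyRow:
    """Illustrative stand-in for one row 201 of table 200."""
    group_id: str                                               # user or group ID 203
    blocked_keywords: List[str] = field(default_factory=list)   # keyword list 205
    pics_rules: List[str] = field(default_factory=list)         # PICS rules 207, kept opaque
    blacklist: List[str] = field(default_factory=list)          # URL patterns 209
    whitelist: Optional[List[str]] = None                       # URL patterns 211; None = no whitelist


# Hypothetical table contents, keyed by group ID.
TABLE_200: Dict[str, PolicyRow] = {
    row.group_id: row
    for row in (
        PolicyRow("children", blocked_keywords=["gambling"],
                  whitelist=["http://kids.example.com/*"]),
        PolicyRow("adults", blacklist=["http://blocked.example.com/*"]),
    )
}


def row_for(group_id: str) -> Optional[PolicyRow]:
    """Return the policy row for a group, or None if the group is unknown."""
    return TABLE_200.get(group_id)
```

Keeping the whitelist optional distinguishes "no whitelist configured" from "an empty whitelist that
permits nothing", which matters when the row is later used to decide whether a URL may be accessed.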
[0028]With the blocking table installed in the cache and blocking engine, several process options
are available for combining the content search and content blocking engines so that only content
ultimately permitted by the blocking policy is returned to the client as a result of a content
search. While it is possible to have different rules for each individual user, it is much easier to
process a single set of rules applied to all users, or to divide the users into a small number of
user groups, each with its own rules. When individual users or groups are defined, they can be
identified by several means. Such means include mapping the client system's IP address to a
user/group ID, using HTTP basic authentication at the start of a browsing session, and using HTTP
web cookies to track user identity.
[0029]Turning to FIG. 3, a process 300 implements the blocking policy during the content indexing
phase. In Step 302, the content scanning and indexing automaton process of the search engine 135 is
modified. In Step 304, the content filtering rules are read from the cache and filtering engine 126
into the search engine server 135 at regular intervals, or whenever a change is detected, via an
application program interface (API) or transfer of a rule definition file. In Step 306, multiple
indexing database trees are generated as needed, and each tree is associated with a user group as
defined in the content filtering rules. For example, one indexing database tree with strict PICS
filtering rules is defined for children, and another indexing database tree with more liberal
filtering rules is defined for adults.
[0030]In Step 308, the automaton process of the search engine starts scanning and indexing content
from the list of target servers while consulting the content blocking rules.
[0031]In Step 310, if a whitelist exists, the search engine searches only websites or root content
sources that explicitly match the permitted site/URL list (the whitelist).
[0032]In Step 312, if a blacklist of URLs to exclude is set in the rules, any website URL that
matches a blacklist pattern is excluded.

[0033]In Step 314, the PICS rules applicable to the user population served by the indexing database
tree are applied to the site / content / document being processed, and the document is excluded or
included accordingly.

[0034]In Step 316, if a list of keywords to exclude is specified, the document text is scanned, and
the document is removed if it contains one or more keywords on the list.

[0035]In Step 318, a document is added to the appropriate indexing database only if it passes the
filtering rules for that group.
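The index-time checks of Steps 306 through 318 can be sketched as follows. This is an illustrative
sketch under stated assumptions, not the implementation described in this document: GroupRules,
permitted, and build_indexes are hypothetical names, URL patterns are assumed to be shell-style
wildcards, documents are assumed to be already fetched as a URL-to-text mapping, and the PICS check
of Step 314 is reduced to a caller-supplied predicate.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase
from typing import Callable, Dict, List, Optional


@dataclass
class GroupRules:
    """Hypothetical per-group filtering rules."""
    blocked_keywords: List[str] = field(default_factory=list)
    blacklist: List[str] = field(default_factory=list)
    whitelist: Optional[List[str]] = None                    # None = no whitelist configured
    pics_allows: Callable[[str], bool] = lambda url: True    # stand-in for PICS rule evaluation


def permitted(url: str, text: str, rules: GroupRules) -> bool:
    """Apply the whitelist, blacklist, PICS, and keyword checks in order."""
    if rules.whitelist is not None and not any(fnmatchcase(url, p) for p in rules.whitelist):
        return False                                         # Step 310: not on the whitelist
    if any(fnmatchcase(url, p) for p in rules.blacklist):
        return False                                         # Step 312: matches the blacklist
    if not rules.pics_allows(url):
        return False                                         # Step 314: rejected by PICS rules
    lowered = text.lower()
    if any(k.lower() in lowered for k in rules.blocked_keywords):
        return False                                         # Step 316: contains a blocked keyword
    return True


def build_indexes(documents: Dict[str, str],
                  groups: Dict[str, GroupRules]) -> Dict[str, Dict[str, str]]:
    """Build one index tree per user group (Step 306), adding only permitted documents (Step 318)."""
    indexes: Dict[str, Dict[str, str]] = {name: {} for name in groups}
    for url, text in documents.items():
        for name, rules in groups.items():
            if permitted(url, text, rules):
                indexes[name][url] = text
    return indexes
```

Because each group gets its own tree, a query at search time only has to select the tree for the
requesting user's group; no filtering work remains for the display phase.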

[0036]The advantage of the process of FIG. 3 is that all of the additional (exclusion) processing is
performed in the database indexing phase. Almost no additional processing is needed in the user
search processing and presentation phase. Although content may be rescanned to pick up possible
changes, search operations are performed far more frequently than indexing operations over the life
cycle of a search engine.

[0037]In FIG. 4, another process 400 implements the blocking policy during the end-user
search-results display phase. In Step 402, the scanning and indexing automaton process of the search
engine is unchanged, and a single indexing database tree is maintained. In Step 404, the search and
display process of the search engine is modified to apply the blocking policy.

[0038]In Step 406, the content filtering rules are read from the cache engine into the search engine
(via an API or transfer of a rule definition) at regular intervals or whenever a change is detected.

[0039]In Step 407, a search request initiated by the user is processed against the index database.
[0040]In Step 408, a list of all matching documents that satisfy the user's request is created and
prepared for application of the blocking rules.

[0041]In Step 410, if a whitelist of explicitly permitted URLs is specified in the rules, all
matching documents that are not on the whitelist are excluded.

[0042]In Step 412, any website URL or document that matches the exclusion pattern list (blacklist)
is excluded.

[0043]In Step 414, if PICS rules are specified, any URL that does not pass the PICS rules is
excluded.

[0044]In Step 416, if a keyword list is specified in the rules, each URL whose text contains one or
more keywords on the list is excluded.

[0045]In Step 418, the remaining subset of URL pointers, which matches the user's request and
satisfies the blocking rules, is returned to the client for display.
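The display-phase narrowing of Steps 407 through 418 can be sketched as a pipeline that starts from
all matches in the single index and removes hits rule by rule. This is an illustrative sketch under
stated assumptions: Rules, search, and filtered_search are hypothetical names, the index is assumed
to be a URL-to-text mapping, matching is a naive substring test, URL patterns are assumed to be
shell-style wildcards, and the PICS check of Step 414 is omitted because it would require label
evaluation.

```python
from fnmatch import fnmatchcase
from typing import Dict, List, NamedTuple, Optional


class Rules(NamedTuple):
    """Hypothetical blocking rules for the requesting user's group."""
    whitelist: Optional[List[str]]   # None = no whitelist specified
    blacklist: List[str]
    blocked_keywords: List[str]


def matches_any(url: str, patterns: List[str]) -> bool:
    return any(fnmatchcase(url, p) for p in patterns)


def search(index: Dict[str, str], query: str) -> List[str]:
    """Steps 407-408: list every indexed URL whose text contains the query."""
    q = query.lower()
    return [url for url, text in index.items() if q in text.lower()]


def filtered_search(index: Dict[str, str], query: str, rules: Rules) -> List[str]:
    hits = search(index, query)
    if rules.whitelist is not None:
        hits = [u for u in hits if matches_any(u, rules.whitelist)]      # Step 410
    hits = [u for u in hits if not matches_any(u, rules.blacklist)]      # Step 412
    # Step 414 (PICS rules) is intentionally left out of this sketch.
    hits = [u for u in hits
            if not any(k.lower() in index[u].lower()
                       for k in rules.blocked_keywords)]                 # Step 416
    return hits                                                          # Step 418
```

Since the rules are consulted on every query, a changed rule takes effect on the next search without
touching the index.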

[0046]The first advantage of the process of FIG. 4 is that the latest policy can be applied to each
search without the impact of rebuilding the indexing database. A single indexing database can be
used for all users. This process also makes it possible to define and change filtering groups, down
to the management of individual users, with almost no impact.

[0047]The indexing database is built by the process 500 correcting a search engine and searching with 
drawing 5 t he contents of the engine which is carrying out cash of the contents. In Step 501 , a search 
engine scan and an indexing automaton process are corrected. A scan and instead of carrying out an 
indexing, a process is set up in a final contents source site search the local repository of the contents of 
cash and a blocking engine. The scanning target of a search engine is corrected to the memory storage 
which is carrying out cash of the suitable contents more nearly rather than a site/URL list in Step 503. In 
Step 505, URL / contents / document tree of cash and a blocking engine are traversed via API, database 
operation, or a share field system operation. In Step 507, since the blocking filtering method to one or more 
user groups is followed in a local installation, the arbitrary documents found out by cash are added to an 
indexing database. . 

[0048]The first advantage of the process of drawing 5 is that filtering and blocking rules are applied only once, by the engine designed for that purpose, i.e., the cache and blocking engine. Scanning and indexing are performed on a local (high-performance) copy of the target contents rather than on the more variable Internet content sites.
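Steps 503 to 507 can be illustrated with a minimal sketch. It assumes, purely for illustration, that the cache exposes its repository as a shared directory laid out as `<cache_root>/<host>/<path>`; the layout and the name `index_cache` are hypothetical and are not taken from this specification.

```python
import os


def index_cache(cache_root):
    """Build a URL -> text index by walking a cache directory (Steps 503-507).

    Assumed (hypothetical) layout: <cache_root>/<host>/<path>, so the URL can
    be reconstructed from the path relative to the cache root.
    """
    index = {}
    for dirpath, _dirnames, filenames in os.walk(cache_root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            relative = os.path.relpath(full_path, cache_root).replace(os.sep, "/")
            url = "http://" + relative
            # Everything in the cache already passed the blocking policy,
            # so no rule is re-applied here; the document is simply indexed.
            with open(full_path, encoding="utf-8", errors="replace") as handle:
                index[url] = handle.read()
    return index
```

Because the cache only holds permitted documents, the indexer in this sketch needs no knowledge of the blocking rules themselves.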

[0049]In drawing 6, the process 600 modifies the search engine so that it passes through the cache and filtering engine when building its own indexing database. In Step 601, the scan and indexing automaton of the search engine is modified so that it is configured like an end user's browser; that is, it uses an HTTP proxy and passes through the cache and filtering engine to reach the target contents. In Step 603, the search engine automaton is configured to use the HTTP proxy of the appropriate cache and filtering engine. In Step 605, while scanning and indexing contents, the search engine automaton is configured to simulate an end user belonging to one of the user groups, so that it receives only the subset of sites/contents/documents permitted by that user group's policy.
[0050]The first advantage of the process of drawing 6 is that the search engine is not substantially modified. Content blocking and filtering are performed by the cache and blocking engine, which is designed and optimized for that purpose. Only contents permitted by the blocking policy reach the search engine for indexing. Because some of the sites/contents to be scanned and indexed are found in local cache storage, the efficiency and performance of the search engine increase.
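As a minimal sketch of Steps 601 to 605, the indexing automaton can be given an HTTP client that is routed through the filtering proxy and that authenticates as a member of one user group. The proxy address, the account name, and the function name `build_crawler_opener` are assumptions made for illustration only; the sketch uses the Python standard library `urllib.request`.

```python
import urllib.request


def build_crawler_opener(proxy_url, user, password):
    """Return an opener whose HTTP requests pass through the filtering proxy.

    The credentials identify the crawler as a member of one user group, so
    the proxy applies that group's blocking policy (hypothetical setup).
    """
    proxy = urllib.request.ProxyHandler({"http": proxy_url})
    passwords = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    passwords.add_password(None, proxy_url, user, password)
    proxy_auth = urllib.request.ProxyBasicAuthHandler(passwords)
    return urllib.request.build_opener(proxy, proxy_auth)
```

A crawler would then call `opener.open(url)` for every page it scans, once per user group whose index is being built; pages the proxy blocks never reach the indexer.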
[0051]As explained above, the content search and content blocking engines are coupled so that only pointers to contents ultimately permitted by the blocking policy are returned to the end user as search results. Several processes for combining the content search and content blocking engines have been described. In this way, the invention provides consistency between an end user's content search results and each organization's content filtering and blocking policy. The invention can be used immediately, without requiring changes to the existing Internet or other networks, i.e., to data protocols and standards.

[0052]Although this invention has been described for the Internet (HTTP/Web) environment, the same concept applies to almost any data and network environment in which data is searched. A list of possible matches is presented to the end user for subsequent consumption or browsing of the data only if it is approved by the access or content management system. Various changes are possible without departing from the spirit and scope of this invention as defined in the claims.
[0053]In conclusion, the following matters are disclosed with respect to the configuration of this invention.

(1) A content indexing search system that provides search results consistent with content filtering and blocking restrictions, comprising: a content indexing search engine containing a database; a cache and blocking proxy server containing a cache; an information network coupled to said content indexing search engine; means for submitting a search query to said content indexing search engine and receiving search results from said cache; a blocking engine that is coupled to said content indexing search engine and enforces a content filtering and blocking policy; and means for modifying said content indexing search engine so that it enforces the same content blocking policy as said blocking engine.

(2) The system according to (1) above, further comprising means for enforcing said blocking policy by said content indexing search engine during a content indexing phase.

(3) The system according to (1) above, further comprising means for enforcing said blocking policy during an end-user search-results display phase.

(4) The system according to (1) above, further comprising means for modifying said content indexing search engine so as to build an indexing database by indexing the contents of the cache.

(5) The system according to (1) above, further comprising means for modifying said content indexing search engine so as to incorporate the results of said cache and blocking engine when said content indexing search engine builds an indexing database.

(6) In a content indexing search system having a content indexing search engine coupled to a database and a cache, an information network coupled to said content indexing search engine, and a blocking engine that enforces content filtering and blocking restrictions on the search results provided to an end user via said cache, a method of providing search results consistent with the content filtering and blocking implementation, comprising:
(a) a step of modifying the process of said content indexing search engine to skip any information site or URL that matches an exclusion pattern;
(b) a step of modifying the process of said content indexing search engine to scan only sites or root content sources that match a list of explicitly permitted information sites/URLs; and
(c) a step of enforcing, by said content indexing search engine, the filtering policy defined by said cache and blocking engine,
wherein in step (c) said filtering policy is set by:
(i) a step of reading the content filtering rules from said cache and filtering engine at regular intervals or whenever a change is detected;
(ii) a step of generating a plurality of indexing database trees, each tree being associated with a user group served under said content filtering rules;
(iii) a step of avoiding display to the user of any information site, URL, or document that matches an exclusion pattern;
(iv) a step of displaying document/content pointers originating from sources that match the list of explicitly permitted information sites/URLs; and
(v) a step of displaying to the user only the information networks/contents/documents that pass the filtering process defined by said cache and said blocking engine,
and wherein in step (v) said filtering process is further defined by:
(aa) a step of reading said content filtering rules from said cache and filtering engine at regular intervals or whenever a change is detected; and
(bb) a step of displaying to the user only the search results permitted under said filtering rules for each individual or group.

(7) The method according to (6) above, further comprising: (d) a step of changing the scanning target of the content engine from an information site/URL list to a content cache storage; and (e) a step of traversing the URL/contents/document tree of said cache and blocking engine by an API, a database scan, and a shared file scan.

(8) The method according to (6) above, further comprising (f) a step of modifying the content scan and indexing process of said search engine so that it is configured like an end user's browser.

(9) The method according to (6) above, further comprising (g) a step of modifying said content indexing search engine so that it passes through said cache when building an indexing database.

(10) The method according to (6) above, further comprising (h) a step of modifying said content indexing search engine so as to build an indexing database by scanning said cache.

(11) The method according to (6) above, further comprising: (i) a step of connecting said content indexing search engine to an internal network; and (j) a step of connecting said content indexing search engine to internal network operation.

(12) The method according to (6) above, further comprising a step of connecting said content indexing search engine to an external network and providing consistency with an organization's content blocking policy.
TECHNICAL FIELD

[Field of the Invention]This invention relates to information retrieval systems. In particular, it relates to a content indexing search system and method that provide search results consistent with the content filtering and blocking policy implemented by a blocking engine.
[0002]

PRIOR ART

[Description of the Prior Art]With the explosive growth of text and multimedia contents available on the Internet and on other data networks and systems, end users increasingly depend on text and keyword search tools to discover information of interest. An end user inputs into a search tool or search engine keywords representing the information or documents sought. The search tool or search engine then consults an existing indexing database and returns a list of pointers to documents considered relevant, together with each document's title and, in many cases, a description of several lines consisting of text extracted from the document body. The end user then navigates some or all of the returned pointers and reads the actual documents or contents on-line. Generally, a search engine indexing database is built automatically or semi-automatically: an automated program is started at a content source (for example, an Internet website), automatically traverses the root content source and its content tree by following links (which in many cases lead to other sites), and indexes the information found into the database for future searches. For large content sources such as websites on the Internet, automated scanning and indexing is the only practical method of building a searchable index database.
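The indexing and keyword lookup described above can be illustrated with a minimal sketch of an inverted index. The function names `build_index` and `search` and the tokenization rule are assumptions made for illustration; production search engines are considerably more elaborate.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each keyword to the set of URLs whose text contains it.

    `pages` is a dict of URL -> document text, standing in for the
    documents gathered by the automated scanning program.
    """
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index


def search(index, query):
    """Return the sorted URLs that contain every keyword of the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    hits = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(hits)
```

The list returned by `search` corresponds to the list of pointers that the search engine presents to the end user.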

[0003]As the diversity of information available on on-line systems and networks increases, companies, individuals, groups, and network service providers (NSPs) are increasingly adopting policies and controls that restrict access to such contents, in order to screen out contents considered unsuitable or unnecessary for end users. Such content management policies generally block unwanted contents, in whole or in part, from reaching the end users of a given on-line service or network. Content blocking is generally performed by a content proxy, a gateway, a data network firewall, or another device inserted between the end user and the target content source. Content filtering is often implemented as part of a content cache engine: only the contents required by the user population are held in the cache, and unwanted contents are prevented from being cached. All users can access network contents only through the cache. Contents are blocked when they are generally regarded as harmful or unsuitable for a user group or for business use, or otherwise at specific times of day. NSPs and companies often rely on a rating system or service, for example the Platform for Internet Content Selection (PICS), to judge the suitability of a content site or document for a specific population. In some systems, end users may choose a set of blocking policies to impose on themselves.

[0004]An important problem arises for NSPs and data transfer providers, who are caught between the need for a ********** automated search engine for large amounts of contents and the need for a blocking engine that automatically prevents some of those contents from eventually reaching end users. Specifically, the problem is the lack of integration and coordination between search engines and the engines that implement filtering and blocking policies. This lack of integration arises for several reasons: (a) many organizations adopt and enforce content filtering and blocking policies at their own sites while depending on search engines or services provided elsewhere, for example search engines available on the Internet; (b) search engines tend to deliberately scan and index as many contents as possible and to actively seek out all contents, whereas filtering and blocking engines are deliberately designed to select which documents are stored in the cache and eventually displayed to end users.

[0005]Because search engines and blocking engines have inherently different roles, and because of the demands of implementation efficiency and high performance, integration and coordination of these two information retrieval functions have been hindered.
[0006]The above problem becomes apparent when an end user uses a search engine service and is shown the titles and descriptions of content documents that, under the filtering/blocking policy, cannot ultimately be accessed. Apart from the inconvenience or frustration this inconsistency causes the end user, the titles and short descriptions returned by the search engine may themselves be highly offensive or otherwise undesirable.

[0007]Therefore, there is a need for a blocking method and an information retrieval system that yield search results consistent or coordinated with the blocking policy, with as little impact on protocols and performance as practicable.
[0008]The following are cited as content indexing search and blocking systems.
[0009]U.S. Pat. No. 5,701,469 (Brandii et al.), published on December 23, 1997, discloses a content indexing search system that performs a search-results collection routine which removes from the results stored search results that were included erroneously and adds stored search results that were excluded erroneously. Thus, the search results produced in response to a user's query are accurate even if the content index used to generate the initial search results is not up to date.
[0010]U.S. Pat. No. 5,835,722 (Brandshaw et al.), filed on June 27, 1996 and published on November 10, 1998, discloses a terminal that blocks the use and transmission of unsuitable material by comprehensively searching and supervising computer operations for the generation or transmission of such material. The terminal is either blocked as a result or blocked only through supervisory intervention.

[0011]U.S. Pat. No. 5,706,507 (Schloss), filed on July 5, 1995 and published on January 6, 1998, discloses an advisory server, operated by a third party, that rates the contents of data downloaded from a content server in order to block or flag unwanted material.
[0012]U.S. Pat. No. 5,619,648 (Canale et al.), published on April 8, 1997, discloses e-mail filtering that determines whether an e-mail message should be delivered to a user based on a model of the relationships associated with that user.

[0013]None of the above prior art discloses a content indexing search system that provides search results conforming to the blocking policy implemented by a blocking engine, such that only contents permitted by the blocking policy are returned to the end user as search results.
[0014] 
TECHNICAL PROBLEM

[Problem(s) to be Solved by the Invention]The first object of this invention is to provide an improved information retrieval system and method of operation that provide consistency between search engine results and a content blocking policy.

[0015]The second object of this invention is to provide an improved content indexing search system and method of operation that provide search results consistent with a blocking policy.
[0016]The third object of this invention is to provide an improved content indexing search system and method that enforce a blocking policy by means of a cache and filtering engine.
[0017]The fourth object of this invention is to provide an improved content indexing search system and method of operation that enforce a blocking policy during the content indexing phase.
[0018]The fifth object of this invention is to provide an improved content indexing search system and method of operation that enforce a blocking policy during the phase in which an end user's search results are displayed.

[0019]The sixth object of this invention is to provide an improved content indexing search system and method of operation that scan the local repository of a cache and blocking engine instead of scanning and indexing the final content sites and content servers.
[0020]The seventh object of this invention is to provide an improved content indexing search system and method of operation configured to pass through a cache and filtering engine on the way to the target contents.
MEANS

[Means for Solving the Problem]These and other objects, features, and advantages are achieved by an information retrieval network that includes a content indexing search engine having a database and a cache engine coupled between the search engine and the end users. The cache engine enforces policy controls that generally block unwanted contents, so that search results are consistent with the filtering and blocking policy of the end user's organization, as implemented in the alternative embodiments.
[0022]In a first embodiment, only contents permitted by the blocking policy are added to the search engine's indexing database. In a second embodiment, the search and display process of the search engine is modified to enforce the blocking policy. In a third embodiment, the operation of the search engine and the target of the indexing automaton process are modified so that the indexing database is built by scanning the contents of the cache engine. In a fourth embodiment, the scan and indexing automaton of the search engine is configured like an end user's browser; that is, it passes through the cache and filtering engine to reach the target contents.
[0023] 

[Embodiment of the Invention]In drawing 1, the information retrieval system 100 has a plurality of client devices 102, 104 connected to the Internet or another distributed data network 106 via an internal or managed network 107. A typical client is a personal computer (PC) with a display 110, a keyboard 111, a CPU 112, a memory 113, and a network connectivity I/O device 115. Examples of such clients and networks include a business user's PC connected to a company's internal network and a home user's PC connected to a service provider's network; in either case, the larger Internet is ultimately accessed. A browser 116, such as those sold under the trademarks Netscape Communicator and IBM Web Explorer, is installed in the memory 113 together with a standard operating system 117 and application programs 118. The browser 116 is run on the client devices 102 and 104 in order to read or download contents from a content server 120 connected to the Internet 106. Each content server has a database 122 for storing the data with which it responds to content requests from the clients 102, 104, and so on. In one form, the data is stored as a collection of HTML documents containing text and other multimedia contents.

[0024]As shown in the figure, the gateway 124 is generally used as an interface between a plurality of clients, or the internal network segment 107, and the Internet 106. A proxy server having a content cache and filtering engine 126 is usually inserted in the connection path from the internal network 107 to the

Internet 106 to implement the content blocking policy, thereby improving performance and manageability. The caching and blocking proxy server can be connected to the gateway, or it can be connected directly, in parallel, to the internal network 107 and the external network 106.

[0025]The client systems 102 and 104 running the web browser 116 request contents from the content server 120 using HTTP (Hypertext Transfer Protocol) requests and receive the contents in HTTP responses. HTTP requests and responses are carried over a TCP/IP socket on the communication link between the client and the content server. A user generates a content request either by explicitly requesting contents stored on the content server or by following a hyperlink anchor that points to contents stored on the content server. On receiving the request, the browser loads the contents using an HTTP session. A detailed description of HTTP is given in Berners-Lee et al., "Hypertext Transfer Protocol - HTTP/1.0", draft-ietf-http-v10-spec-00.txt, March 8, 1995, the entire contents of which are incorporated herein by reference. A detailed description of HTML is given in Berners-Lee, "Hypertext Markup Language (HTML)", draft-ietf-iiir-html-01, June 1993 (draft expired), the entire contents of which are incorporated herein by reference. A detailed description of TCP/IP sockets is given in W. Richard Stevens, "TCP/IP Illustrated, Vol. 1 - The Protocols", Addison-Wesley, 1994, pages 1-20 and 229-262, the entire contents of which are incorporated herein by reference.
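The request/response exchange described above can be illustrated with a minimal sketch that builds the bytes a client writes to the TCP/IP socket and parses the bytes it reads back. The function names are hypothetical, and the `Host` header, which HTTP/1.0 does not require, is included here only because most servers expect it.

```python
def build_get_request(host, path="/"):
    """Return the bytes of a simple HTTP/1.0 GET request for `path`."""
    return f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")


def parse_response(raw):
    """Split a raw HTTP response into (status code, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # Status line, e.g. "HTTP/1.0 200 OK": the second field is the code.
    status = int(lines[0].split(" ", 2)[1])
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status, headers, body
```

In a real client the request bytes would be sent with `socket.sendall` and the response read until the server closes the connection; the sketch omits the network step so that it stays self-contained.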

[0026]A user of a client system using the web browser 116 will in most cases access a conventional dedicated search engine server 130 or 135 and its database 131 or 136, respectively, in order to locate Internet contents by keyword search. Relative to the managed network 107, such a dedicated search engine server is either the external search engine server 130 or the internal search engine server 135. While both perform the same basic function, the internally connected search engine server 135 is managed independently by the internal network operator and is generally preferred. The process of this invention will generally be implemented on the internally connected and managed search engine 135, or on an external search engine 130 that, as a service to an organization, provides results consistent with that organization's content blocking policy. As the result of a keyword search directed to the search engine server 130 or 135, the end user sees displayed on the web browser 116 a list of matching URLs with text extracts, shown as hyperlink anchors to the final contents. Using the web browser 116, the user can select and follow the link to one or more of the matching contents.

[0027]In drawing 2, a sample content filtering/blocking configuration table 200 is generated by a client or by a network/service administrator and is installed in the proxy server 126 in order to filter or restrict the availability of contents considered unsuitable or unwanted. Generally, these content access control methods block unwanted contents, in whole or in part, from reaching all end users of a given on-line service or network. The table is installed in the cache and filtering engine 126 and is usually stored in the database 127. In one form, the table has, for each user or user group, a row 201 containing a user or group identification descriptor (ID) 203, a list 205 of keywords to block, a PICS (Platform for Internet Content Selection) rule 207, a black list 209 of URLs that must not be contacted, and a white list 211 of the only URLs that may be contacted. URLs are described in "Uniform Resource Locators (URL)" by Berners-Lee et al., RFC 1738, December 1994, the entire contents of which are incorporated herein by reference. A PICS rating is obtained from a PICS rule, which permits or blocks access to a URL based on the PICS labels, embedded in a document, that describe the URL. PICS rules are described in the W3C document published on the Internet at http://www.w3.org/TR/REC-PICSRules-971229. Specifically, PICS rules are a language for expressing filtering rules (profiles) that permit or block access to URLs based on the PICS labels that describe those URLs. Labels are generated using software tools based on the PICS Technical Specification 1.1, available on the Internet at http://www.w3.org/PICS/. A software tool is used to generate a label for the document that describes a specific URL. Alternatively, instead of attaching a label to the document, an independent third party can distribute labels through a separate server called a label bureau. Filtering software knows to check the label bureau to find labels, much as consumers know to read a particular magazine for reviews of appliances or automobiles. Once a label is generated, it is inserted as an additional header in the HTTP header stream that precedes the contents of the document sent to the web browser. Alternatively, the label can be embedded in an HTML document using a META tag. With this method, labels can be sent only with HTML documents, not with images, video, or other content. PICS-compliant content servers are available from International Business Machines Corporation (Armonk, NY).
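One row of the table 200 can be sketched as a small data structure. The field names, the use of shell-style wildcard patterns for the black and white lists, and the helper `url_allowed` are assumptions made for illustration; the specification does not prescribe a storage format.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase
from typing import List, Optional


@dataclass
class BlockingRow:
    """One row 201 of the blocking table (hypothetical field names)."""
    group_id: str                                        # ID 203
    keywords: List[str] = field(default_factory=list)    # keyword list 205
    pics_rule: Optional[str] = None                      # PICS rule 207
    blacklist: List[str] = field(default_factory=list)   # black list 209
    whitelist: List[str] = field(default_factory=list)   # white list 211


def url_allowed(row, url):
    """Apply only the URL lists: white list first, then black list."""
    if row.whitelist and not any(fnmatchcase(url, p) for p in row.whitelist):
        return False
    return not any(fnmatchcase(url, p) for p in row.blacklist)
```

Keeping one row per user group, rather than per user, matches the observation below that a small number of groups is easier to administer.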

[0028]Given the blocking table installed in the cache and blocking engine, several process options are available for combining the content search and content blocking engines so that only contents ultimately permitted by the blocking policy are returned to the client as search results. While it is possible to have different rules for each individual user, it is much easier to handle either a single set of rules applied to all users or a division of users into a small number of user groups, each with its own rules. If individual users or groups are configured, they can be identified by several means. Examples of such means include mapping the client system's IP address to a user/group ID, using HTTP basic authentication at the start of a browsing session, and using HTTP web cookies to track user identity.

[0029]Turning to drawing 3, the process 300 enforces the blocking policy during the content indexing phase. In Step 302, the content scan and indexing automaton process of the search engine 135 is modified. In Step 304, the content filtering rules from the cache and filtering engine 126 are read into the search engine server 135, at regular intervals or whenever a change is detected, via an application program interface (API) or by transferring a rule definition file. In Step 306, multiple indexing database trees are generated as needed, and each tree is associated with a user group as defined in the content filtering rules. For example, one indexing database tree with strict PICS filtering rules is defined for children, and another indexing database tree with more permissive filtering rules is defined for adults.
[0030]In Step 308, the automaton process of the search engine begins scanning and indexing contents from the list of target servers while checking the content blocking rules.
[0031]In Step 310, if a white list exists, the search engine scans only websites or root content sources that explicitly match the permitted site/URL list, i.e., the white list.
[0032]In Step 312, if a black list of URLs to be excluded is set up in the rules, any website URL that matches a black list pattern is excluded.

[0033]In Step 314, the PICS rules applicable to the user set served by the indexing database tree are applied to the site/contents/document being processed, and as a result the document is either excluded or included.

[0034]In Step 316, if a list of keywords to exclude is specified, the document text is scanned, and the document is removed if it contains one or more keywords from the list.

[0035]In Step 318, a document is added to the appropriate indexing database only if it is approved under the filtering rules for that group.
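Steps 306 to 318 can be sketched as follows. The sketch assumes that each group's rule is a plain dictionary and, as a deliberate simplification, replaces the PICS rule with a single numeric rating compared against a per-group maximum; real PICS rules are richer. All names are hypothetical.

```python
from fnmatch import fnmatchcase


def permitted(rule, url, text, rating):
    """Apply Steps 310-316 to one scanned document for one user group."""
    whitelist = rule.get("whitelist")
    if whitelist and not any(fnmatchcase(url, p) for p in whitelist):
        return False                                   # Step 310
    if any(fnmatchcase(url, p) for p in rule.get("blacklist", [])):
        return False                                   # Step 312
    max_rating = rule.get("max_rating")                # stand-in for PICS
    if max_rating is not None and rating > max_rating:
        return False                                   # Step 314
    lowered = text.lower()
    if any(k.lower() in lowered for k in rule.get("keywords", [])):
        return False                                   # Step 316
    return True


def build_trees(rules, documents):
    """Build one index tree per user group (Step 306) and fill it (Step 318).

    `rules` maps group ID -> rule dict; `documents` is a list of
    (url, text, rating) tuples produced by the scanning automaton.
    """
    trees = {group: {} for group in rules}
    for url, text, rating in documents:
        for group, rule in rules.items():
            if permitted(rule, url, text, rating):
                trees[group][url] = text
    return trees
```

At query time the search engine only has to pick the tree that belongs to the requesting user's group; no rule is evaluated during the search itself.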

[0036]The advantage of the process of drawing 3 is that all additional processing (exclusion) is performed in the database indexing phase. Almost no additional processing is needed in the user search processing and presentation phases. Although contents may be rescanned to pick up possible changes, search operations are performed far more frequently than indexing operations over the life cycle of a search engine.

[0037]In drawing 4, another process 400 enforces the blocking policy during the end-user search-results display phase. In Step 402, the scan and indexing automaton process of the search engine is left unchanged, and a single indexing database tree is maintained. In Step 404, the search and display process of the search engine is modified to apply the blocking policy.

[0038]In Step 406, the content filtering rules from the cache engine are read into the search engine (via an API or by transferring a rule definition), at regular intervals or whenever a change is detected.

[0039]In Step 407, processing of a search request initiated by a user is started against the index database.

[0040]In Step 408, a list of all matching documents that satisfy the user's request is created in preparation for applying the blocking rules.

[0041]In Step 410, if a white list of explicitly permitted URLs is specified in the rules, all matching documents not contained in the white list are excluded.

[0042]In Step 412, any website URL or document that matches the exclusion pattern list (black list) is excluded.

[0043]In Step 414, if a PICS rule is specified, any URL that does not pass the PICS rule is also excluded.

[0044]In Step 416, if a keyword list is specified in the rule, each URL whose text contains one or more keywords from the list is excluded.

[0045]In Step 418, the remaining subset of URL pointers, which satisfies both the user's request and the blocking rules, is returned to the client for display.
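The display-phase filtering of Steps 408 to 418 can be sketched as a single pass over the list of matching documents. The hit format (a dictionary with `url`, `text`, and a precomputed `pics_ok` flag) and the rule keys are assumptions made for illustration; in particular, evaluating an actual PICS rule is outside the scope of this sketch.

```python
from fnmatch import fnmatchcase


def filter_hits(hits, rule):
    """Return the URLs from `hits` that the requesting user's rule permits."""
    whitelist = rule.get("whitelist", [])
    blacklist = rule.get("blacklist", [])
    keywords = [k.lower() for k in rule.get("keywords", [])]
    kept = []
    for hit in hits:
        url = hit["url"]
        if whitelist and not any(fnmatchcase(url, p) for p in whitelist):
            continue                                   # Step 410
        if any(fnmatchcase(url, p) for p in blacklist):
            continue                                   # Step 412
        if rule.get("pics_required") and not hit.get("pics_ok", False):
            continue                                   # Step 414
        text = hit.get("text", "").lower()
        if any(k in text for k in keywords):
            continue                                   # Step 416
        kept.append(url)
    return kept                                        # Step 418
```

Because the rule is looked up on every query, a policy change takes effect on the next search without rebuilding the index.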

[0046]The first advantage of the process of drawing 4 is that the newest policy can be applied to each search without having to rebuild the indexing database. A single indexing database can be used for all users. With this process, administration down to the individual level and changes to the definition of a filtering group can be made with almost no impact.

[0047] In drawing 5, process 500 modifies the search engine so that the indexing database is built by 
searching the contents held by the engine that caches the contents. In Step 501, the scanning and indexing 
automaton process of the search engine is modified. Instead of scanning and indexing the final content 
source sites, the process is set up to search the local content repository of the cache and blocking 
engine. In Step 503, the scanning target of the search engine is changed from a site/URL list to the 
appropriate content cache storage. In Step 505, the URL/content/document tree of the cache and blocking 
engine is traversed via an API, database operations, or shared file system operations. In Step 507, 
because the blocking and filtering method for one or more user groups has already been applied at the 
local installation, any document found in the cache is added to the indexing database. 
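
As a rough sketch of Steps 505 and 507, the following builds a simple inverted index from a cache repository. The `cache_store` mapping (URL to cached text) is a hypothetical stand-in for the repository of the cache and blocking engine, and the word-level inverted index is an assumed data structure; the original text specifies neither.

```python
import re
from collections import defaultdict


def build_index_from_cache(cache_store):
    """Build an inverted index (word -> set of URLs) from cached contents.

    cache_store is a hypothetical mapping of URL -> cached document text.
    Only documents already admitted by the blocking engine are assumed to
    be present, so no further filtering is done here.
    """
    index = defaultdict(set)
    for url, text in cache_store.items():              # Step 505: traverse the cache
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(url)                       # Step 507: add to the index
    return index


def search(index, term):
    # Return the URLs whose cached text contains the term, in sorted order.
    return sorted(index.get(term.lower(), set()))
```

Since the crawler never leaves the cache, blocked content cannot enter the index, which is the property the process relies on.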

[0048] The first advantage of the process of drawing 5 is that the filtering and blocking rules are 
applied only once, by the engine designed for that purpose, i.e., the cache and blocking engine. Scanning 
and indexing are performed on a local (high-performance) copy of the target contents rather than on more 
variable Internet content sites. 

[0049] In drawing 6, process 600 modifies the search engine so that it passes through the cache and 
filtering engine when it builds its own indexing database. In Step 601, the scanning and indexing 
automaton of the search engine is modified and configured like an end user's browser. That is, it uses an 
HTTP proxy and passes through the cache and filtering engine to reach the target contents. In Step 603, 
the search engine automaton is configured to use the HTTP proxy set up for the appropriate cache and 
filtering engine. In Step 605, while scanning and indexing contents, the search engine automaton is 
configured to simulate an end user belonging to one of the user groups, so that only the subset of 
sites/contents/documents permitted by that user group's policy is received. 

[0050] The first advantage of the process of drawing 6 is that the search engine is not substantially 
modified. Content blocking and filtering are performed by the cache and blocking engine, which is designed 
and optimized for that purpose. Only the contents permitted by the blocking policy reach the search engine 
for indexing. Since some of the sites/contents to be scanned and indexed are found in the local cache 
storage, the efficiency and performance of the search engine increase. 
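
The proxy configuration of Steps 601 to 605 might look like the sketch below, which uses Python's standard `urllib.request.ProxyHandler`. The function names, the proxy address, and the idea of identifying the crawler's user group through proxy credentials embedded in the proxy URL are assumptions for illustration; the original text does not say how the automaton identifies itself to the cache and filtering engine.

```python
import urllib.request


def make_crawler_opener(proxy_url):
    """Return an opener that routes every crawl request through the proxy.

    proxy_url is hypothetical, e.g. "http://group-user:secret@proxy.example:3128";
    the credentials stand in for the simulated end user of one user group.
    """
    proxy = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(proxy)


def fetch_for_indexing(opener, url, timeout=10):
    # The proxy (cache and filtering engine) decides whether the content is
    # returned; a blocked page would typically surface here as an HTTP error.
    with opener.open(url, timeout=timeout) as resp:
        return resp.read()
```

Under this arrangement the indexing code itself is unchanged; only the transport is redirected through the filtering proxy.
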
[0051] As explained above, the content search engine and the content blocking engine are combined so that 
only pointers to contents ultimately permitted by the blocking policy are returned to the end user as 
search results. Several processes for combining the content search engine and the content blocking engine 
have been described. In this way, the invention provides consistency between the results of an end user's 
content search and each organization's content filtering and blocking policy. The invention can be put to 
use immediately without requiring changes to the existing Internet or other networks, that is, to their 
data protocols and standards. 

[0052] Although this invention has been explained in terms of the Internet (HTTP/Web) environment, the same 
concept applies to almost any data and network environment in which data is searched. The list of 
possible matches is displayed to the end user who consumes or browses the data only as approved by the 
access or content management system. Various changes are possible without departing from the spirit and 
scope of this invention as defined in the claims. 

[0053] In conclusion, the following matters are disclosed regarding the configuration of this invention. 

(1) A contents indexing search system which provides search results consistent with content filtering and 
blocking restrictions, comprising: a contents indexing search engine containing a database; a cache and 
blocking proxy server containing a cache; an information network coupled to said contents indexing search 
engine; a means for submitting a search query to said contents indexing search engine and receiving search 
results from said cache; a blocking engine which is coupled to said contents indexing search engine and 
enforces a content filtering and blocking policy; and a means for modifying said contents indexing search 
engine to enforce the same content blocking policy as said blocking engine. 

(2) The system according to (1) above, further comprising a means for enforcing said blocking policy by 
said contents indexing search engine during a contents indexing phase. 

(3) The system according to (1) above, further comprising a means for enforcing said blocking policy 
during an end-user search-results display phase. 

(4) The system according to (1) above, further comprising a means for modifying said contents indexing 
search engine so as to build an indexing database by indexing the cache contents. 

(5) The system according to (1) above, further comprising a means for modifying said contents indexing 
search engine so as to incorporate the results of said cache and blocking engine when said contents 
indexing search engine builds an indexing database. 

(6) In a contents indexing search system having a contents indexing search engine coupled to a database 
and a cache, an information network coupled to said contents indexing search engine, and a blocking engine 
which applies content filtering and blocking restrictions to the search results provided to an end user 
via said cache, a method of providing search results consistent with the implemented content filtering and 
blocking, comprising: 
(a) a step of modifying the process of said contents indexing search engine so as to skip any information 
site URL that matches an exclusion pattern; 
(b) a step of modifying the process of said contents indexing search engine so as to search only the sites 
or root content sources that match the list of explicitly permitted information site URLs; and 
(c) a step of enforcing, by said contents indexing search engine, the filtering policy defined by said 
cache and blocking engine, 
wherein in step (c) said filtering policy is defined by: 
(i) a step of reading the content filtering rules from said cache and filtering engine at fixed intervals 
or whenever a change is detected; 
(ii) a step of generating multiple indexing database trees, each tree being associated with a user group 
defined under said content filtering rules; 
(iii) a step of avoiding display to the user of any information site, URL, or document that matches an 
exclusion pattern; 
(iv) a step of displaying document/content pointers originating from sources that match the list of 
explicitly permitted information site URLs; and 
(v) a step of displaying to the user only the information network/contents/documents that conform to the 
filtering process defined by said cache and said blocking engine, 
and wherein in step (v) said filtering process is further defined by: 
(aa) a step of reading said content filtering rules from said cache and filtering engine at fixed 
intervals or whenever a change is detected; and 
(bb) a step of displaying to the user only the search results permitted under said filtering rules for 
each individual or group. 

(7) The method according to (6) above, further comprising: (d) a step of changing the scanning target of 
the content engine from an information site/URL list to a content cache storage; and (e) a step of 
traversing the URL/content/document tree of said cache and blocking engine by an API, a database scan, or 
a shared file scan. 

(8) The method according to (6) above, further comprising (f) a step of modifying the content scanning and 
indexing process of said search engine so that it is configured like an end user's browser. 

(9) The method according to (6) above, further comprising (g) a step of modifying said contents indexing 
search engine so that it passes through said cache when building an indexing database. 

(10) The method according to (6) above, further comprising (h) a step of modifying said contents indexing 
search engine so as to build an indexing database by searching said cache. 

(11) The method according to (6) above, further comprising: (i) a step of connecting said contents 
indexing search engine to an internal network; and (j) a step of connecting said contents indexing search 
engine to internal network operation. 

(12) The method according to (6) above, further comprising a step of connecting said contents indexing 
search engine to an external network and providing consistency with an organization's content blocking 
policy. 






DESCRIPTION OF DRAWINGS 
[Brief Description of the Drawings] 

[Drawing 1] It is a block diagram for explaining the configuration of an information retrieval system 
based on this invention. 

[Drawing 2] It is a table of the content blocking rules enforced in the information retrieval system shown 
in drawing 1. 

[Drawing 3] It is a flow chart for explaining operation of the system of drawing 1 in a first embodiment, 
which enforces the blocking policy during the content search and indexing phase. 

[Drawing 4] It is a flow chart for explaining operation of the system of drawing 1 in a second embodiment, 
which enforces the blocking policy during the phase in which search results are displayed to the end user. 

[Drawing 5] It is a flow chart for explaining the search engine of drawing 1 in a third embodiment. 

[Drawing 6] It is a flow chart for explaining the search engine of drawing 1 in a fourth embodiment. 

[Description of Notations] 

100 Information retrieval system 
102 Client device 
104 Client device 
106 Internet 
107 Network 
110 Display 
111 Keyboard 
112 CPU 
113 Memory 
115 Network connection I/O device 
116 Browser 
117 Operating system 
118 Application program 
120 Contents server 
122 Database 
124 Gateway 
126 Cache and content filtering engine 
130 Search engine server 
131 Database 
135 Search engine server 
136 Database 
200 Table 
201 Line 
203 Identifier 
204 User or group ID 
205 List of keywords to block 
207 PICS rule 
209 Black list 
211 White list 
300 Process 
400 Process 